Chapter 8 Network Analysis
Written by Nicholas Martino and Paul Pickell
Networks are abstract structures commonly used to represent patterns of relationships among sets of various things (Ajorlou 2018). Such structures can be used to represent social connections, spatial patterns, ecological relationships, etc. In GIS, the elements that compose geospatial networks are geolocated – in other words: they have latitude and longitude values attached to them. Network analysis encompasses a series of techniques used to interpret information from those networks. This chapter introduces basic concepts for building, analyzing and applying spatial networks to real-world problems.
- Understand what networks are and to identify the elements that compose them
- Categorize different types of networks according to their topologies
- Create spatial networks and learn how to apply them in various applications
- Extract relevant information from spatial networks about the relationship between their elements, such as routes, distances and centralities
8.1 Introduction to Graph Theory
Graphs are the abstract language of networks (Systems Innovation 2015a). Graph theory is the area of mathematics that study graphs. By abstracting networks into graphs, one is able to measure different kinds of indicators that represents information about relationships that exist within a certain system. Why abstracting real-world elements into networks can be useful? Network analysis facilitates the study of data sets that demand information about their behaviour in terms of connectivity, flows, direction or paths. This is especially useful to understand the behaviour of complex adaptive systems such as societies, cities, ecosystems, etc. All graphs are composed of two parts: nodes and edges (or links).
8.2 Nodes
A node (or vertex) may represent any thing that can be connected with other things. For example, it can represent people in social networks, street intersections in road networks, or chemical compounds in molecular networks, among others.
8.3 Edges
Edges (or links), on the other hand, represent how vertices are interconnected to each other. So it may represent the vertices’ social connections, street segments, molecular bindings, etc. The graph below represents rapid and frequent transit lines in Metro Vancouver. Each node represents a transit line and the edges represents connections between those lines.
Figure 8.1: Graph representing connection between Metro Vancouver rapid and frequent transit lines. Interactive figure in the online format of the textbook.
Figure 8.2: Rapid and frequent transit network in Metro Vancouver (TransLink 2020).
8.4 Connectivity and Order
There are two major types of connections within the graphs: directed and undirected. Connections are directed when they have a specific node of origin and destination.
8.5 Direct
Directed graphs are networks where the order of elements change relationships between them. We represent directed connections with an arrow. The network below represents relationships between characters of Les Miserables. For example, in the case of the transit network we could use a directed graph to represent the path one has to take in order to shift from one line to another.
Figure 8.3: Example of directed graph for social relationships. Interactive figure in the online format of the textbook.
8.6 Undirect
On the other hand, in an undirected graph, connections are represented as simple lines instead of arrows. The order of elements does not matter.
8.7 Network Topologies
Topology is the study of how network elements are arranged. The same elements arranged in different ways can change the network structure and dynamics. A very common example is the arrangement of computer networks.
Figure 8.4: Abstract examples of network topologies (Wikibooks 2018).
8.8 Physical vs. Logical Topology
In GIS we use networks to represent spatial structures of various kinds. While all networks can be represented in an abstract space - this is, without a defined position in the real-world - some network analysis might be more useful when we attach physical properties to them, such as latitude and longitude coordinates. We call logical topology the study of how network elements are arranged in this abstract space. On the other hand, physical topology refers to the arrangemet of networks in the physical space. We can then classify “types” of networks according to the way their nodes is arranged.
8.9 Non-Hierarchical Topologies
8.9.1 Lines
Lines are when nodes are arranged in series where every node has no more than two connections, except for the two end nodes. A rail transit line, for example, can be represented as a line network. The map below portrays the SkyTrain Millenium Line in Vancouver. Each node represents a stop and the lines the connections between those stops.
Figure 8.5: Rail transit line in Vancouver (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
8.9.2 Rings
Rings are similar to lines except that there are no end nodes. So each and every node has two connections and the “first” and “last” nodes are connected to each other forming a circle. The spatial structure of the Stanley park seawall trail in Vancouver resembles a ring. In this example, nodes stand for intersections and view spots and edges are the connections between these spots along the seawall.
Figure 8.6: Ring of the Stanley Park seawall (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
8.9.3 Meshes
In a mesh, every node is also connected to more than one node. However, in this case nodes can be connected to more than two nodes. Connections in a mesh are non-hierarchical. Contrary to rings and lines where there is only one possible route from one node to another, in a mesh there are multiple routes to access other nodes in the network. A common way to generate a mesh network is using Delaunay triangulation, where nodes are connected in order to form triangles and maximize the minimum angle of all triangles (Wikimedia 2021a). Mesh configurations are commonly used in decentralized structures such as the internet.
Figure 8.7: Tree canopy mesh (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
8.9.4 Fully Connected
As the name suggests, in fully connected networks every node is connected to every other node. The graph representing all possible origin-destination commutes among Metro Vancouver municipalities is a type of fully connected network.
Figure 8.8: Possible origin-destination commutes between municipalities within Metro Vancouver (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
8.10 Hierarchical Topologies
Different from non-hierarchical topologies, hierarchical configurations are structured around a central node or link. By looking into hierarchical topologies it becomes easier to understand the notion of depth. The more distant a node is from the central node or link, the more depth it has. Hover the mouse over the nodes in the following maps to check out their depth.
8.10.1 Stars
Stars are hierarchical structures where two or more nodes are directly connected to a central node. This concentric garden at the University of British Columbia can be represented according to a star topology.
Figure 8.9: Spatial structure of a concentric garden at UBC (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
8.10.2 Buses
Buses are structures where every path from one node to another passes through a central path or corridor. If we isolate a street segment from an urban street network, the connections between buildings and streets depict a bus topology.
Figure 8.10: Connections between houses and streets (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
8.10.3 Trees
In tree topologies, nodes are structured from a root node and arranged into edges that are similar to branches of a tree. This highly hierarchical structure create a sort of parent - child relationship amongst nodes. The spatial configuration of boat marinas are usually structures in tree-like topologies. By definition, all tree network structures will always have more than one terminal nodes (a node that only has one connection to the network).
Figure 8.11: Tree spatial structure of a boat marina (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
8.11 Spatial Network Analysis
Networks can then be arranged according to various different configurations. Aside from classifying networks into different types according to their topologies, some of the most useful features of spatial network analysis refers to how to extract information from these structures given certain parameters.
8.12 Network Tracing
The act of modelling spatial networks is called network tracing. When tracing a network it is important to bear in mind the direction with which information is added to the network, especially when this orientation information is important to further analyze flows and relationships within such structure. For example, when mapping hydrological networks to study its flows it might be useful to model the direction of streams coherently as this might be an important information to represent the dynamics of the network.
Figure 8.12: Graph representing Fraser River Flows (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
8.13 Linear Referencing
Linear referencing is a method of using geographic locations for measuring relative positions along a linear feature. In network analysis, linear referencing techniques can be used for finding the length of paths along the network (Ramsey 2012). In this method, the graph elements are defined in terms of their physical location and edges are used to calculate distances among parts of the network.
Figure 8.13: Example of using linear referencing to measure distance between points.
In the above figure we can see how linear referencing systems work. Considering one would like to measure distances from node a along a network link L, distance measures can be used to locate, for example, points b, c and d along line L.
8.14 Routing
One application of linear referencing is to find routes between nodes is an useful application of spatial networks. This is how mapping tools help us navigate the world by finding the most efficient route to move around the city, for example. (Systems Innovation 2015b).
8.15 Least Cost Paths
Usually multiple paths can be traced along a network to go from one point to another. The notion of cost allow us compare the degree of difficulty needed to cross such paths. With this information, it is possible to rank different routes. In spatial networks, cost usually relate to the necessary distance (either physically or logically) to go from a certain node to another, but they might also represent other aspects such as time, traffic, elevation, current flows, etc. For example, way finding tools that are commonly used to help us to locate and move around in the city usually takes into account multiple costs such as distance, traffic and/or elevation. The least cost path is the easiest way to go from one point of the network to another.
It can be found by associating the costs with elements of the network (either nodes or edges) and summarizing the total cost of certain routes. Usually linear referencing techniques are used to calculate costs by storing locations along measured linear features. The map below displays the least cost path in terms of physical distance (shortest path) between two points.
Figure 8.14: Least cost path (in terms of distance) between two points (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
8.16 Least Cost Corridors
Although most network analysis techniques are suitable for vector data, raster layers can also be analyzed. One application of using network analysis with raster data is the finding of least cost corridors. While least cost paths helps to find linear paths along the network between two points, least cost corridors are based on the overlay of two cost accumulative rasters.
8.17 Reach Analysis
Reach techniques are commonly used to find the incidence of defined elements within a certain radius from a chosen node. All possible routes are modeled. The number of terminal nodes varies according to the network structure. Urban walkability indices usually uses reach techniques to assess the intensity of certain indicators (such as intersection density or non-residential land uses) given a walkable radius (Martino 2020). In the map below we portray the network reach from a given origin point into the Pacific Spirit Regional Park within 400m (red), 800m (yellow), 1200m (green) and 1600m (blue) radii.
Figure 8.15: Reachable segments within multiple distance radii (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
8.18 Network Centrality
Nodes and edges of a graph can also be ranked in terms of how important they are to the overall network. Network centrality measures represent whether elements of a graph are more central or peripheral to the overall system. Such measures can then be interpreted as indicators of importance. Applications are endless. Centrality measures are used for ranking search engine pages (Wikimedia 2021b), for finding persons of interest in social networks (Ajorlou 2018) and for modelling movement in street network (Hillier et al. 1993). There are several centrality measures that serve to the most various purposes. Some of the most commonly used ones are Closeness and Betweenness centrality.
8.19 Closeness Centrality
Closeness centrality measures how close each node is to every other node of the graph in terms of topological distances. It highlights nodes located in easily accessible spaces. For example when analyzing the closeness of street intersections at the University of British Columbia (UBC), intersections in core streets such as the Main Mall, Agronomy Road and Northwest Marine Drive are ranked as highly central whereas more residential and segregated areas such as Acadia Road are ranked with lower closeness centrality.
Figure 8.16: Closeness centrality of the street intersections at UBC (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
Closeness is calculated based on the logical distance from one vertex to all the other vertices in the network. The formula for estimating closeness centrality of a vertex \(i\) is:
\(c_i = \sum\limits_{j} \frac{1}{d_{ij}}\)
where \(d_{ij}\) means the logical distance from \(i\) to \(j\).
8.20 Betweenness Centrality
Betweenness centrality measures how likely a node or an edge is to be passed through when going from every node to every other node of the graph. If we imagine agents travelling from each node to every other node and back, betweenness centrality would be the trail left by those agents. While closeness highlights central spaces, betweenness highlights pathways that lead to those central spaces. Using the same street network at UBC we can calculate the betweenness of segments.
Figure 8.17: Betweenness centrality of street segments at UBC (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
Betweenness is calculated based on the number of shortest paths (in logical distances) from all nodes to all other nodes. According to the documentation of the graph-tool software, betweenness of a vertex \(C_{B}(\upsilon)\) is defined as:
\(C_{B}(\upsilon) = \sum\limits_{s \neq v \neq t \in V} \frac {\sigma_{st}(v)}{\sigma_{st}}\)
where \((v)\) \({\sigma_{st}}\) represents the number of shortest paths from node \(s\) to node \(t\) and \({\sigma_{st}(v)}\) represents the number of those paths that pass through \(v\) (graph-tool n.d.). We can use centrality measures to evaluate how accessible certain spaces are from the point of view of their spatial structure with a broader system.
8.21 Case Study: Central and Peripheral Green Spaces in Vancouver
Are green spaces evenly accessible throughout the whole city? Which parks are topologically closer to the city as a whole? Centrality analysis of the street network can be used to answer these questions.
First we need to find the Closeness measure for the street network of the City of Vancouver. The network information was downloaded from OpenStreetMap. The graph-tool software was used to calculate the centrality measure. With the results of centrality for all street intersections in the city, we can overlay Parks & Green spaces data from the City of Vancouver Open Data portal and get the average closeness of nodes within each green space.
Figure 8.18: Closeness centrality of parks and green spaces at the City of Vancouver (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
## Warning: Removed 6 rows containing non-finite values (stat_density).
Figure 8.18: Closeness centrality of parks and green spaces at the City of Vancouver (City of Vancouver n.d.). Licensed under the Open Government License - Vancouver. Interactive figure in the online format of the textbook.
Results show parks located in the middle of the city have higher closeness than parks located at the edges. In other words, these parks are located in parts of the city that have easy access to the city’s street network as a whole. As the histogram leans towards the right, we can conclude that there are more parks with higher closeness than parks with lower closeness.
Practice & Reflection
Some questions to reflect and better understand the basic concepts and applications of network analysis explained in this chapter:
- Which types of behaviour can be modelled and understood using network analysis techniques?
- What is the difference between physical and logical distances?
- How different costs can be used for routing along spatial networks?
- How can network centrality measures be interpreted in spatial networks?